
    Adapted K-Nearest Neighbors for Detecting Anomalies on Spatio–Temporal Traffic Flow

    Outlier detection is an extensive research area that has been intensively studied in several domains, such as biological sciences, medical diagnosis, surveillance, and traffic anomaly detection. This paper explores advances in outlier detection by finding anomalies in spatio-temporal urban traffic flow. It proposes a new approach that considers the distribution of the flows in a given time interval. The flow distribution probability (FDP) databases are first constructed from the traffic flows by considering both spatial and temporal information. The outlier detection mechanism is then applied to the incoming flow distribution probabilities: the inliers are stored to enrich the FDP databases, while the outliers are excluded from them. Moreover, a k-nearest neighbor approach for distance-based outlier detection is investigated and adopted for FDP outlier detection. To validate the proposed framework, real data from the Odense traffic flow case are evaluated at ten locations. The results reveal that the proposed framework is able to detect the real distribution of flow outliers. Another experiment was carried out on Beijing data; the results show that our approach outperforms the baseline algorithms for high-urban traffic flow.
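
The inlier/outlier handling described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the Euclidean distance, the mean-of-k-nearest score, the `threshold` value, and the function names `knn_outlier_score` and `process_incoming` are all assumptions made for illustration.

```python
import math


def knn_outlier_score(candidate, database, k=3):
    """Mean Euclidean distance from `candidate` to its k nearest entries.

    Assumption: each entry is a flow distribution stored as a list of
    probabilities of equal length.
    """
    if not database:
        raise ValueError("the FDP database must contain at least one entry")
    distances = sorted(math.dist(candidate, entry) for entry in database)
    nearest = distances[:k]
    return sum(nearest) / len(nearest)


def process_incoming(candidate, database, k=3, threshold=0.2):
    """Store inliers in the database; leave outliers out.

    Returns True when the candidate is treated as an inlier.
    The threshold is a hypothetical tuning parameter.
    """
    score = knn_outlier_score(candidate, database, k)
    if score <= threshold:
        database.append(candidate)  # inliers enrich the database
        return True
    return False  # outliers are excluded


fdp_database = [[0.20, 0.30, 0.50], [0.25, 0.30, 0.45], [0.20, 0.35, 0.45]]
print(process_incoming([0.22, 0.30, 0.48], fdp_database))  # close to stored flows
print(process_incoming([0.90, 0.05, 0.05], fdp_database))  # far from stored flows
```

In this sketch only candidates judged to be inliers change the database, so a single anomalous interval does not shift the reference distributions.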

    Data Mining-Based Decomposition for Solving the MAXSAT Problem: Toward a New Approach

    This article explores advances in data mining to solve the fundamental MAXSAT problem. In the proposed approach, the MAXSAT instance is first decomposed and clustered using data mining decomposition techniques; every cluster resulting from the decomposition is then solved separately to construct a partial solution. All partial solutions are merged into a global one, while managing the variable conflicts that can arise from the separate resolutions. The proposed approach has been numerically evaluated on DIMACS instances and some hard Uniform-Random-3-SAT instances, and compared to state-of-the-art decomposition-based algorithms. The results show that the proposed approach considerably improves the success rate, with a competitive computation time that is very close to that of the compared solutions.

    Machine learning for smart building applications: Review and taxonomy

    © 2019 Association for Computing Machinery. The use of machine learning (ML) in smart building applications is reviewed in this article. We split existing solutions into two main classes: occupant-centric versus energy/devices-centric. The first class groups solutions that use ML for aspects related to the occupants, including (1) occupancy estimation and identification, (2) activity recognition, and (3) estimating preferences and behavior. The second class groups solutions that use ML to estimate aspects related either to energy or devices. They are divided into three categories: (1) energy profiling and demand estimation, (2) appliances profiling and fault detection, and (3) inference on sensors. Solutions in each category are presented, discussed, and compared; open perspectives and research trends are discussed as well. Compared to related state-of-the-art survey papers, the contribution herein is to provide a comprehensive and holistic review from the ML perspective rather than from the architectural and technical aspects of existing building management systems. This is achieved by considering all types of ML tools, buildings, and several categories of applications, and by structuring the taxonomy accordingly. The article ends with a summary discussion of the presented works, with a focus on lessons learned, challenges, and open and future directions of research in this field.

    Frequent itemset mining in big data with effective single scan algorithms

    © 2013 IEEE. This paper considers frequent itemset mining in transactional databases. It introduces a new accurate single scan approach for frequent itemset mining (SSFIM), a heuristic alternative (EA-SSFIM), and a parallel implementation on Hadoop clusters (MR-SSFIM). EA-SSFIM and MR-SSFIM target sparse and big databases, respectively. The proposed approach (in all its variants) requires only one scan to extract the candidate itemsets, and it has the advantage of generating a fixed number of candidate itemsets independently of the value of the minimum support. This accelerates the scan process compared with existing approaches when dealing with sparse and big databases. Numerical results show that SSFIM outperforms the state-of-the-art FIM approaches when dealing with medium and large databases. Moreover, EA-SSFIM provides similar performance to SSFIM while considerably reducing the runtime for large databases. The results also reveal the superiority of MR-SSFIM compared with the existing HPC-based solutions for FIM on sparse and big databases.
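
The single-scan idea in this abstract can be sketched as follows. This is an illustrative reading, not the published SSFIM code: it assumes that candidates are produced by enumerating every subset of each transaction during the one pass, and that the minimum support is given as an absolute count (`min_count`). The function name is hypothetical.

```python
from collections import Counter
from itertools import combinations


def single_scan_frequent_itemsets(transactions, min_count):
    """Count every itemset occurring in any transaction in one pass,
    then keep those that appear in at least `min_count` transactions.

    The candidate counts do not depend on `min_count`; the threshold is
    applied only after the scan has finished.
    """
    counts = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))  # canonical order, no duplicates
        for size in range(1, len(items) + 1):
            for itemset in combinations(items, size):
                counts[itemset] += 1
    return {itemset: n for itemset, n in counts.items() if n >= min_count}


transactions = [["a", "b"], ["a", "c"], ["a", "b", "c"]]
frequent = single_scan_frequent_itemsets(transactions, min_count=2)
for itemset, n in sorted(frequent.items()):
    print(itemset, n)
```

Enumerating all subsets grows exponentially with transaction length, so a sketch like this is only practical for short transactions.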

    Cluster-based information retrieval using pattern mining

    This paper addresses the problem of responding to user queries by fetching the most relevant object from a clustered set of objects. It addresses the common drawbacks of cluster-based approaches and targets fast, high-quality information retrieval. For this purpose, a novel cluster-based information retrieval approach is proposed, named Cluster-based Retrieval using Pattern Mining (CRPM). This approach integrates various clustering and pattern mining algorithms. First, it groups similar objects into clusters. Three clustering algorithms based on k-means, DBSCAN (Density-based spatial clustering of applications with noise), and Spectral are suggested to minimize the number of shared terms among the clusters of objects. Second, frequent and high-utility pattern mining algorithms are performed on each cluster to extract the pattern bases. Third, the clusters of objects are ranked for every query. In this context, two ranking strategies are proposed: i) Score Pattern Computing (SPC), which calculates a score representing the similarity between a user query and a cluster; and ii) Weighted Terms in Clusters (WTC), which calculates a weight for every term and uses the relevant terms to compute the score between a user query and each cluster. Irrelevant information derived from the pattern bases is also used to deal with unexpected user queries. To evaluate the proposed approach, extensive experiments were carried out on two use cases: a document corpus and a tweet corpus. The results showed that the designed approach outperformed traditional and cluster-based information retrieval approaches in terms of the quality of the returned objects while being very competitive in terms of runtime.
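
The abstract does not give the scoring formulas, so the following is only a hypothetical illustration of ranking clusters against a query through their pattern bases. The score used here (the fraction of a cluster's patterns fully contained in the query) and the names `pattern_score` and `rank_clusters` are assumptions, not the paper's SPC or WTC definitions.

```python
def pattern_score(query_terms, patterns):
    """Fraction of a cluster's patterns whose terms all occur in the query.

    Assumption: each pattern is a tuple of terms mined from the cluster.
    """
    if not patterns:
        return 0.0
    query = set(query_terms)
    matched = sum(1 for pattern in patterns if set(pattern) <= query)
    return matched / len(patterns)


def rank_clusters(query_terms, pattern_bases):
    """Return cluster names ordered from best to worst match."""
    return sorted(
        pattern_bases,
        key=lambda name: pattern_score(query_terms, pattern_bases[name]),
        reverse=True,
    )


# Hypothetical pattern bases for two clusters.
pattern_bases = {
    "sports": [("football",), ("football", "league")],
    "cooking": [("recipe",), ("oven", "recipe")],
}
print(rank_clusters(["football", "league", "results"], pattern_bases))
```

Ranking clusters first means a query only needs to be compared against the objects of the top-ranked clusters rather than the whole collection.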

    Emergent Deep Learning for Anomaly Detection in Internet of Everything

    This research presents a new generic deep learning framework for anomaly detection in the Internet of Everything (IoE). It combines decomposition methods, deep neural networks, and evolutionary computation to better detect outliers in IoE environments. The dataset is first decomposed into clusters so that similar observations are grouped in the same cluster. Five clustering algorithms were used for this purpose. Deep learning architectures are then trained on the generated clusters. In this context, we propose a new recurrent neural network for training on time series data. Two evolutionary computation algorithms, a genetic algorithm and a bee swarm algorithm, are also proposed to fine-tune the training step. These algorithms consider the hyper-parameters of the trained models and try to find the optimal values. The proposed solutions have been experimentally evaluated for two use cases: 1) road traffic outlier detection and 2) network intrusion detection. The results show the advantages of the proposed solutions and a clear superiority compared to state-of-the-art approaches.

    Vehicle detection using improved region convolution neural network for accident prevention in smart roads

    This paper explores the vehicle detection problem and introduces an improved regional convolution neural network. The vehicle data (set of images) is first collected, from which the noise (set of outlier images) is removed using the SIFT extractor. The region convolution neural network is then used to detect the vehicles. We propose a new hyper-parameter optimization model based on evolutionary computation that can be used to tune the parameters of the deep learning framework. The proposed solution was tested using the well-known Boxy vehicle detection dataset, which contains more than 200,000 vehicle images and 1,990,000 annotated vehicles. The results are very promising and show superiority over many current state-of-the-art solutions in terms of runtime and accuracy.

    Intelligent Deep Fusion Network for Anomaly Identification in Maritime Transportation Systems

    This paper introduces a novel deep learning architecture for identifying outliers in the context of intelligent transportation systems. The use of a convolutional neural network with decomposition is explored to find abnormal behavior in maritime data. The set of maritime data is first decomposed into clusters of homogeneous data, and a convolutional neural network is then used for each data cluster. One model is trained per cluster, so each model learns from highly correlated data. Finally, the results of the models are merged using a simple but efficient fusion strategy. To verify the performance of the proposed framework, intensive experiments were conducted on marine data. The results show the superiority of the proposed framework compared to the baseline solutions in terms of several accuracy metrics.

    Penguins Search Optimisation Algorithm for Association Rules Mining

    Association Rules Mining (ARM) is one of the most popular and well-known approaches for the decision-making process. All existing ARM algorithms are time consuming and generate a very large number of highly overlapping association rules. To deal with this issue, we propose a new ARM approach based on the penguins search optimisation algorithm (Pe-ARM for short). Moreover, an efficient measure is incorporated into the main process to evaluate the amount of overlap among the generated rules. The proposed approach also ensures good diversification over the whole solution space. To demonstrate the effectiveness of the proposed approach, several experiments have been carried out on different datasets, specifically on biological ones. The results reveal that the proposed approach outperforms the well-known ARM algorithms in both execution time and solution quality.

    Hybrid RESNET and Regional Convolution Neural Network Framework for Accident Estimation in Smart Roads

    This work tackles road safety and proposes an intelligent deep learning framework that includes outlier detection, vehicle detection, and accident estimation. The road state is first collected, and an intelligent filter based on the SIFT extractor and a Chinese restaurant process is used to remove noise. The extended region-based convolution neural network is then applied to identify the closest vehicles to the given driver. The residual network uses the output of the vehicle detection process to make a binary classification of whether the current road state might cause an accident. Finally, we propose a novel optimization model for optimizing hyper-parameters in deep learning methodologies by using evolutionary computation. The proposed solution has been tested using benchmark vehicle detection and accident estimation datasets. The results are very promising and show superiority over many current state-of-the-art solutions in terms of runtime and accuracy; the proposed solution improves the accident estimation rate by more than 5% compared to the conventional methods.